Feature-Rich Statistical Translation of Noun Phrases
Abstract
We define noun phrase translation as a subtask of machine translation. This enables us to build a dedicated noun phrase translation subsystem that improves over the currently best general statistical machine translation methods by incorporating special modeling and special features. We achieved 65.5% translation accuracy in a German-English translation task vs. 53.2% with IBM Model 4.
Similar resources
Experiments with a Noun-Phrase driven Statistical Machine Translation System
This paper presents a noun-phrase-driven two-level statistical machine translation system. Noun phrases (NPs) are used as the unit of decomposition to build a two-level hierarchy of phrases. English noun phrases are identified using a parser. The corresponding translations are induced using a statistical word alignment model. Identified noun phrase pairs in the training corpus are replaced with...
Definite noun phrases in statistical machine translation into Scandinavian languages
The Scandinavian languages have an unusual structure of definite noun phrases (NPs), with a noun suffix as one possibility of expressing definiteness, which is problematic for statistical machine translation from languages with different NP structures. We show that translation can be improved by simple source side transformations of definite NPs, for translation from English and Italian, into D...
Paraphrasing of Swedish Compound Nouns
The goal for this project is to examine and evaluate the effect of paraphrasing noun-noun compounds, with the aim of improving machine translation. The paraphrases will elicit the underlying relationship that holds between the compounding nouns, with the use of prepositional and verb phrases. Though some types of noun-noun compounds are too lexicalized, or have some other qualities that make th...
Paraphrasing Swedish Compound Nouns in Machine Translation
This paper examines the effect of paraphrasing noun-noun compounds in statistical machine translation from Swedish to English. The paraphrases are meant to elicit the underlying relationship that holds between the compounding nouns, with the use of prepositional and verb phrases. Though some types of noun-noun compounds are too lexicalized, or have some other qualities that make them unsuitable...
Handling Multiword Expressions in Phrase-Based Statistical Machine Translation
Preprocessing of the parallel corpus plays an important role in improving the performance of phrase-based statistical machine translation (PB-SMT). In this paper, we propose a framework in which predefined information about Multiword Expressions (MWEs) can boost the performance of PB-SMT. We preprocess the parallel corpus to identify Noun-noun MWEs, reduplicated phrases, complex predicates and ...